SimBa: An Efficient Tool for Approximating Rips-Filtration Persistence via Simplicial Batch-Collapse
Abstract
In topological data analysis, a point cloud P sampled from a metric space is often analyzed by computing the persistence diagram, or barcode, of a sequence of Rips complexes built on P and indexed by a scale parameter. Unfortunately, even for input of moderate size, the Rips complex may become prohibitively large as the scale parameter increases. Starting with the Sparse Rips filtration introduced by Sheehy, several existing methods aim to reduce the size of the complex and thereby improve time efficiency as well. However, as we demonstrate, existing approaches still do not scale well, especially for high-dimensional data. In this paper, we investigate the advantages and limitations of existing approaches. Based on insights gained from the experiments, we propose an efficient new algorithm, called SimBa, for approximating the persistent homology of Rips filtrations with quality guarantees. Our new algorithm leverages a batch-collapse strategy as well as a new sparse Rips-like filtration. We experiment on a variety of low- and high-dimensional data sets. We show that our strategy achieves a significant size reduction, and that our algorithm for approximating Rips filtration persistence is an order of magnitude faster than existing methods in practice.
1998 ACM Subject Classification: F.2.2 Nonnumerical Algorithms and Problems.
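The size blow-up mentioned in the abstract can be seen even on a toy input. The following is a minimal sketch, not the SimBa algorithm or any of the methods the paper compares: it counts simplices of a Rips complex by brute force, assuming the convention that a simplex is present when all pairwise distances among its vertices are at most the scale. The function name `rips_simplex_count`, the `max_dim` cutoff, and the unit-square input are illustrative assumptions.

```python
from itertools import combinations
from math import dist


def rips_simplex_count(points, scale, max_dim=2):
    """Count simplices of dimension <= max_dim in the Rips complex at `scale`.

    Assumed convention: a simplex is included when every pair of its
    vertices is at distance <= scale (some authors use 2 * scale instead).
    """
    n = len(points)
    # Precompute which pairs of points are within the scale.
    close = [[dist(points[i], points[j]) <= scale for j in range(n)]
             for i in range(n)]
    count = 0
    for k in range(1, max_dim + 2):  # k vertices span a (k - 1)-simplex
        for simplex in combinations(range(n), k):
            if all(close[i][j] for i, j in combinations(simplex, 2)):
                count += 1
    return count


# Hypothetical toy input: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
counts = [rips_simplex_count(square, r) for r in (0.5, 1.0, 1.5)]
print(counts)  # [4, 8, 14]
```

With only four points the count grows from 4 (vertices only) to 14 (all vertices, edges, and triangles) as the scale increases. On n points the full complex up to dimension d has on the order of n^(d+1) simplices, which is the growth that sparsification and collapse strategies aim to avoid.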
Similar resources
Stable Signatures for Dynamic Metric Spaces via Zigzag Persistent Homology
When studying flocking/swarming behaviors in animals one is interested in quantifying and comparing the dynamics of the clustering induced by the coalescence and disbanding of animals in different groups. Motivated by this, we study the problem of obtaining persistent homology based summaries of time-dependent metric data. Given a finite dynamic metric space (DMS), we construct the zigzag simpl...
Persistent homology in graph power filtrations
The persistence of homological features in simplicial complex representations of big datasets in $\mathbb{R}^n$ resulting from Vietoris-Rips or Čech filtrations is commonly used to probe the topological structure of such datasets. In this paper, the notion of homological persistence in simplicial complexes obtained from power filtrations of graphs is introduced. Specifically, the rth complex, r ≥ 1, in su...
Approximating persistent homology for a cloud of $n$ points in a subquadratic time
The Vietoris-Rips filtration for an n-point metric space is a sequence of large simplicial complexes adding a topological structure to the otherwise disconnected space. The persistent homology is a key tool in topological data analysis and studies topological features of data that persist over many scales. The fastest algorithm for computing persistent homology of a filtration has time O(M(u) +...
Approximate Cech Complexes in Low and High Dimensions
Čech complexes reveal valuable topological information about point sets at a certain scale in arbitrary dimensions, but the sheer size of these complexes limits their practical impact. While recent work introduced approximation techniques for filtrations of (Vietoris-)Rips complexes, a coarser version of Čech complexes, we propose the approximation of Čech filtrations directly. For fixed dimens...
Approximate Čech Complex in Low and High Dimensions
Čech complexes reveal valuable topological information about point sets at a certain scale in arbitrary dimensions, but the sheer size of these complexes limits their practical impact. While recent work introduced approximation techniques for filtrations of (Vietoris-)Rips complexes, a coarser version of Čech complexes, we propose the approximation of Čech filtrations directly. For fixed dimens...